Speaker Separation Using Speaker Inventories and Estimated Speech
Abstract
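We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework leveraging speaker profiles and estimated speech for speaker separation. SSUSIES contains two methods: speaker separation using speaker inventories (SSUSI) and speaker separation using estimated speech (SSUES). SSUSI performs speaker separation with the help of a speaker inventory. By combining the advantages of permutation invariant training (PIT) and speech extraction, SSUSI significantly outperforms conventional approaches. SSUES is a widely applicable technique that can substantially improve speaker separation performance using the output of first-pass separation. We evaluate the models on both speaker separation and speech recognition metrics.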
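The abstract names permutation invariant training (PIT) without describing it. As a rough illustration of the general idea only, the sketch below scores every assignment of estimated signals to reference speakers and keeps the cheapest one. The function names `mse` and `pit_mse`, the mean-squared-error objective, and the toy signals are assumptions made for this example; the abstract does not state the loss or the model the authors use.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_mse(estimates, references):
    """Utterance-level PIT loss (illustrative; MSE is an assumed objective).

    estimates, references: lists of per-speaker signals of equal length.
    Returns (best_loss, best_perm), where best_perm[i] is the index of the
    estimate assigned to reference speaker i.
    """
    n = len(references)
    best_loss, best_perm = float("inf"), None
    # Try every assignment of estimates to references and keep the cheapest.
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[perm[i]], references[i]) for i in range(n)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: the estimates are the references in swapped order.
refs = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
ests = [refs[1], refs[0]]
loss, perm = pit_mse(ests, refs)
```

Because the search enumerates all permutations, its cost grows factorially with the number of speakers, so this form is only practical for small speaker counts.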
Similar resources
Speech separation using speaker-adapted eigenvoice speech models
We present a system for model-based source separation for use on single channel speech mixtures where the precise source characteristics are not known a priori. The sources are modeled using hidden Markov models (HMM) and separated using factorial HMM methods. Without prior speaker models for the sources in the mixture it is difficult to exactly resolve the individual sources because there is n...
Speaker separation using visual speech features and single-channel audio
This work proposes a method of single-channel speaker separation that uses visual speech information to extract a target speaker’s speech from a mixture of speakers. The method requires a single audio input and visual features extracted from the mouth region of each speaker in the mixture. The visual information from speakers is used to create a visually-derived Wiener filter. The Wiener filter...
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Speaker Verification Using Coded Speech
The implementation of a pseudo text-independent Speaker Verification system is described. This system was designed to use only information extracted directly from the coded parameters embedded in the ITU-T G.729 bitstream. Experiments were performed over the YOHO database [1]. The feature vector as a short-time representation of speech consists of 16 LPC-Cepstral coefficients, as well as residu...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2020.3045556